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Abstract 



In this paper we address the complexity of solving linear programming problems with a set of 
differential equations that converge to a fixed point that represents the optimal solution. Assuming a 
probabilistic model, where the inputs arc i.i.d. Gaussian variables, we compute the distribution of the 
convergence rate to the attracting fixed point. Using the framework of Random Matrix Theory, we 
derive a simple expression for this distribution in the asymptotic limit of large problem size. In this 
limit, we find the surprising result that the distribution of the convergence rate is a scaling function of 
a single variable. This scaling variable combines the convergence rate with the problem size (i.e., the 
number of variables and the number of constraints). We also estimate numerically the distribution of 
the computation time to an approximate solution, which is the time required to reach a vicinity of the 
attracting fixed point. Wc find that it is also a scaling function. Using the problem size dependence 
of the distribution functions, we derive high probability bounds on the convergence rates and on the 
computation times to the approximate solution. 



Keywords: Theory of Analog Computation, Dynamical Systems, Linear Program- 
ming, Scaling, Random Matrix Theory. 
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1 Introduction 



In recent years scientists have developed new approaches to computation, some of them 
based on continuous time analog systems. Analog VLSI devices, that are often described by 
differential equations, have applications in the fields of signal processing and optimization. 
Many of these devices are implementations of neural networks 121 El) or the so-called 
neuromorphic systems 0| which are hardware devices whose structure is directly motivated 
by the workings of the brain. In addition there is an increasing number of algorithms based 
on differential equations that solve problems such as sorting (31 , linear programming jSl and 
algebraic problems such as singular value decomposition and finding of eigenvectors (see 
[71 and references therein). On a more theoretical level, differential equations are known 
to simulate Turing machines |H1. The standard theory of computation and computational 
complexity 1^1 deals with computation in discrete time and in a discrete configuration space, 
and is inadequate for the description of such systems. This work may prove useful in the 
analysis and comparison of analog computational devices (see e.g. pi ITU]). 

In a recent paper we have proposed a framework of analog computation based on ODE's 
that converge exponentially to fixed points In such systems it is natural to consider 
the attracting fixed point as the output. The input can be modeled in various ways. One 
possible choice is the initial condition. This is appropriate when the aim of the computa- 
tion is to decide to which attractor out of many possible ones the system flows (see 
The main problem within this approach is related to initial conditions in the vicinity of 
basin boundaries. The flow in the vicinity of the boundary is slow, resulting in very long 
computation times. Here, as in TT" the parameters on which the vector field depends are 
the input, and the initial condition is part of the algorithm. This modeling is natural for 
optimization problems, where one wishes to find extrema of some function E{x), e.g. by a 
gradient flow x = grad£'(x). An instance of the optimization problem is specified by the 
parameters of E{x), i.e. by the parameters of the vector field. 

The basic entity in our model of analog computation is a set of ODEs 



where x is an n-dimensional vector, and F is an n-dimensional smooth vector field, which 
converges exponentially to a fixed point. Eq. solves a computational problem as follows: 
Given an instance of the problem, the parameters of the vector field F are set, and it is 
started from some pre-determined initial condition. The result of the computation is then 
deduced from the fixed point that the system approaches. 

Even though the computation happens in a real configuration space, this model can be 
considered as either a model with real inputs, as for example the BSS model {13] , or as a 
model with integer or rational inputs, depending what types of values the initial conditions 
are given. In it was argued that the time complexity in a large class of ODEs is the 
physical time that is the time parameter of the system. The initial condition there was 
assumed to be integer or rational. In the present paper, on the other hand, we consider real 
inputs. More specifically, we will analyze the complexity of a flow for linear programming 
(LP) introduced in In the real number model the complexity of solving LP with interior 
point methods is unbounded ^31 , and a similar phenomenon occurs for the flow we analyze 
here. To obtain finite computation times one can either measure the computation time in 




(1) 
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terms of a condition number as in ^S], or impose a distribution over the set of LP instances. 
Many of the probabihstic models used to study the performance of the simplex algorithm 
and interior point methods assume a Gaussian distribution of the data j^lElEl) and 
we adopt this assumption for our model. Recall that the worst case bound for the simplex 
algorithm is exponential whereas some of the probabilistic bounds are quadratic 

Two types of probabilistic analysis were carried out in the LP literature: average case 
and "high probability" behavior |19|l2(H l^. A high probability analysis provides a bound on 
the computation time that holds with probability 1 as the problem size goes to infinity j21j . 
In a worst case analysis interior point methods generally require 0(-y/n| log e|) iterations 
to compute the cost function with e-precision, where n is the number of variables j2Uj . 
The high probability analysis essentially sets a limit on the required precision and yields 
0{^/nlogn) behavior However, the number of iterations has to be multiplied by the 
complexity of each iteration which is C(n^), resulting in an overall complexity 0{n^'^ logn) 
in the high probability model [221 • The same factor per iteration appears in the average 
case analysis as well [T9] . 

In contrast, in our model of analog computation, the computation time is the physical 
time required by a hardware implementation of the vector field F(x) to converge to the 
attracting fixed point. We need neither to follow the flow step-wise nor to calculate the 
vector field F(x) since it is assumed to be realized in hardware and does not require repetitive 
digital approximations. As a result, the complexity of analog processes does not include 
the 0{n^) term as above, and in particular it is lower than the digital complexity of interior 
point methods. In this set-up we conjecture, based on numerical calculations, that the 
flow analyzed in this paper has complexity O (n log n) on average and with high probability. 
This is higher than the number of iterations of state of the art interior point methods, but 
lower than the overall complexity ©(n^'^logn) of the high probability estimate mentioned 
above, which includes the complexity of an individual operation. 

In this paper we consider a flow for linear programming proposed by Faybusovich 
for which F{x) is given by (jlj). Substituting (0]) into the general equation (P) we obtain ©, 
which realizes the Faybusovich algorithm for LP. We consider real inputs that are drawn 
from a Gaussian probability distribution. For any feasible instance of the LP problem, the 
flow converges to the solution. We consider the question: Given the probability distribution 
of LP instances, what is the probability distribution of the convergence rates to the solution? 
The convergence rate measures the asymptotic computation time: the time to reach an e 
vicinity of the attractor, where e is arbitrarily small. The main result of this paper, as 
stated in Theorem (|4.1|) . is that with high probability and on the average, the asymptotic 
computation time is O {\/n\ log e|), where n is the problem size and e is the required precision 
(see also Corollary (jS.!!) ). 

In practice, the solution to arbitrary precision is not always required, and one may need 
to know only whether the flow (pQ) or © has reached the vicinity of the optimal vertex, 
or which vertex out of a given set of vertices will be approached by the system. Thus, 
the non-asymptotic behavior of the flow needs to be considered jllj . In this case, only a 
heuristic estimate of the computation time is presented, and in Section 6 we conjecture that 
the associated complexity is O (nlogn), as mentioned above. 

The rest of the paper is organized as follows: In section 2 the Faybusovich flow is 
presented along with an expression for its convergence rate. The probabilistic ensemble of 
the LP instances is presented in section 3. The distribution of the convergence rate of this 
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flow is calculated analytically in the framework of random matrix theory (RMT) in section 
4. In secton 5 we introduce the concept of "high-probability behavior" and use the results of 
section 4 to quantify the high-probability behavior of our probabilistic model. In section 6 
we provide measures of complexity when precision asymptotic in e is not required. Some of 
the results in sections 6.2-8 are heuristic, supported by numerical evidence. The structure 
of the distribution functions of parameters that control the convergence is described in 
section 7 and its numerical verification is presented in section 8. Finally, the results of this 
work and their possible implications are discussed in section 9. Some technical details are 
relegated to the appendices. Appendix A contains more details of the Faybusovich flow. 
Appendix B exposes the details of the analytical calculation of the results presented in 
section 4, and appendix C contains the necessary details of random matrix theory relevant 
for that calculation. 

2 A flow for linear programming 

We begin with the definition of the linear programming problem (LP) and a vector field for 
solving it introduced by Faybusovich in |6j . The standard form of LP is to find 

max{c^x : xe'\R"',Ax = b,x>0} (2) 

where c G R", b G R™, A G 11*"^" and m < n. The set generated by the constraints in ^ 
is a polyheder. If a bounded optimal solution exists, it is obtained at one of its vertices. 
Let B C {1, . . . , n}, \B\ = m, and J\f = {1, . . . ,n} \ B, and denote by xq the coordinates 
with indices from B, and by Ag, the m x m matrix whose columns are the columns of A 
with indices from B. A vertex of the LP problem is defined by a set of indices B, which is 
called a basic set, if 

= A^^b > . (3) 

The components of a vertex are that satisfy ©, and xj\f = 0. The set A/" is then called 
a non-basic set. Given a vector field that converges to an optimal solution represented by 
basic and non-basic sets B and M, its solution x{t) can be decomposed as {xj\f{t),xi3{t)) 
where xj^{t) converges to 0, and xsit) converges to A^^b. 

In the following we consider the non-basic set M = {1, . . . , n — m}, and for notational 
convenience denote the m x m matrix by B and denote Aj^ by A^, i.e. A = (N, B). 

The Faybusovich vector field is a projection of the gradient of the linear cost function 
onto the constraint set, relative to a Riemannian metric which enforces the positivity con- 
straints X > 6 . Let h{x) = c^x. We denote this projection by grad/i. The explicit form 
of the gradient is: 

grad h{x) = [X - XA^{AXA^)-^AX] c , (4) 
where X is the diagonal matrix Diag(xi . . . x„). It is clear from (jlj) that 

Agrad h{x) = . 

Thus, the dynamics 

— =grad/i(x) (5) 
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preserves the constraint Ax = 6 in Thus, the faces of the polyheder are invariant sets 
of the dynamics induced by grad h. Furthermore, it is shown in [S] that the fixed points 
of grad h coincide with the vertices of the polyheder, and that the dynamics converges 
exponentially to the maximal vertex of the LP problem. Since the formal solution of the 
Faybusovich vector field is the basis of our analysis we give its derivation in Appendix A. 

Solving (jSJ requires an appropriate initial condition - an interior point in this case. This 
can be addressed either by using the "big-M" method ^Tl, which has essentially the same 
convergence rate, or by solving an auxiliary linear programming problem We stress 
that here, the initial interior point is not an input for the computation, but rather a part 
of the algorithm. In the analog implementation the initial point should be found by the 
same device used to solve the LP problem. 

The linear programming problem ((2)) has n — m independent variables. The formal 
solution shown below, describes the time evolution of the n — m variables xj\f(t), in terms of 
the variables xis{t). When M is the non-basic set of an optimal vertex of the LP problem, 
xj\f{t) converges to 0, and xts{t) converges to b. Denote by e^, . . . the standard basis 
of M", and define the n — m vectors 

m 

^i = e' + J2 ajie^ , (6) 
i=i 

where 

aji = -{B-'N)ji (7) 

is an m X (n — m) matrix. The vectors ^* are perpendicular to the rows of A and are 
parallel to the faces of the polyheder defined by the constraints. In this notation the 
analytical solution is (see Appendix 1^ : 

x,{t) = Xi{0) exp 1 -A,t - f; a,, log ] , i g AA = {1, . . . , n - m} (8) 

\^ p{ Xj+n-m{0) J 

where Xj(0) and 

•^j+n— m(0) 8-1^6 Components of the initial condition, Xjj^ri—m 

(t) are the xg 

components of the solution, and 

m 

Ai = - < f/,c >= -Ci - ^ Cjaji (9) 

i=i 

(where < ., . > is the Euclidean inner product). 

An important property which relates the signs of the Aj and the optimality of the 
partition of A (into {B, N)) relative to which they were computed is now stated: 

Lemma 2.1 [0] For a polyhedron with {n — m+l, . . . , n}, a basic set of a maximum vertex, 

Aj>0 i = l,...,n — m. 

The converse statement does not necessarily hold. The Aj are independent of b. Thus we 
may have that all Aj are positive, and yet the constraint set is empty. 

Remark 2.1 Note that the analytical solution is only a formal one, and does not provide 
an answer to the LP instance, since the Aj depend on the partition of A, and only relative 
to a partition corresponding to a maximum vertex are all the Aj positive. 
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The quantities Aj are the convergence rates of the Faybusovich flow, and thus measure 
the time required to reach the e- vicinity of the optimal vertex, where e is arbitrarily smah: 



T,~^, (10) 

where 

Amm = minAi. (11) 

Therefore, if the optimal vertex is required with arbitrary precision e , then the computation 
time (or complexity) is O |^A~|„| logej^ 

In summary, if the Aj are small then large computation times will be required. The 
Aj can be arbitrarily small when the inputs are real numbers, resulting in an unbounded 
computation time. However, we will show that in the probabilistic model, which we de- 
fine in the next section, "bad" instances are rare, and the flow performs well "with high 
probability" (see Theorem (|4.H) and Corollary (|5.1|1 ). 



3 The probabilistic model 

We now define the ensemble of LP problems for which we analyze the complexity of the 
Faybusovich fiow. Denote by A^(0, cj^) the standard Gaussian distribution with mean 
and variance cr^. Consider an ensemble in which the components of {A,h,c) are i.i.d. 
(independent identically distributed) random variables with the distribution A^(0, cj'^). The 
model will consist of the following set of problems: 

LPM = {{A,b,c) I (^, 6, c) are i.i.d. variables with the distribution iV(0,cj^) (12) 

and the LP problem has a bounded optimal solution} . 

Therefore, we use matrices with a distribution A^(0, a"^): 

/(^) = J-e,p(-^trA-A) (13) 

with normalization 

Za = j (T^'A exp (^--LtrA^A^ = (2^a2) '""^' . (14) 

The ensemble H13() factorizes into ran i.i.d. Gaussian random variables for each of the 
components of A. 

The distributions of the vectors c and b are defined by: 

with normalization 

Z, = j (Tc exp (^-^c^c) = (2tt(j^Y''' , (16) 
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and 




with normalization 




With the introduction of a probabihstic model of LP instances A^[^ becomes a random 
variable. We wish to compute the probability distribution of Amin for instances with a 
bounded solution, when A^^^ > 0. We reduce this problem to the simpler task of computing 
V{Amin > A\Amin > 0), in which the condition A^m > is much easier to impose than 
the condition that an instance produces an LP problem with a bounded solution. This 
reduction is justified by the following lemma: 

Lemma 3.1 

^(Ainin > A|LP instance has a bounded maximum vertex) = (19) 

^(Ajnin 

> A|Amin > 0). 

Proof. Let {A, b, c) be an LP instance chosen according to the probability distributions 
(|13() . H15|) and H17() . There is a unique orthant (out of the 2" orthants) where the constraint 
set Ax = b defines a nonempty polyheder. This orthant is not necessarily the positive 
orthant, as in the standard formulation of LP. 

Let us consider now any vertex of this polyheder with basic and non-basic sets B and AA. 
Its m non-vanishing coordinates are given by solving Aqxq = b. The matrix Ais is full 
rank with probability 1; also, the components of are non-zero and finite with probability 
1. Therefore, in the probabilistic analysis we can assume that x/s is well defined and non- 
zero. With this vertex we associate the n — m quantities Aj = —{cj^)i + {cqA~^^ Aj^)i., from 
®. 

We now show that there is a set of 2™ equiprobable instances, which contains the in- 
stance {A, b, c), that shares the same vector b and the same values of {Aj}, when computed 
according to the given partition. This set contains a unique instance with in the positive 
orthant. Thus, if A^m > 0, the latter instance will be the unique member of the set which 
has a bounded optimal solution. 

To this end, consider the set 1Z{xb) of the 2™- reflections QiXb of xg, where Qi is an 
m X m diagonal matrix with diagonal entries ±1 and / = 1, 2, 2*". 

Given the instance {A, b, c) and a particular partition into basic and non-basic sets, we 
split A columnwise into {Ais,Aj\f) and c into {cts,cj\f). Let S be the set of 2"^ instances 
{{AbQi, Aj\f),b, {QiCi3,Cf>j-)) where / = 1,...,2™. The vertices Q/Xg of these instances, 
which correspond to the prescribed partition, comprise the set TZ{xs), since {A]sQi){QiX]s) = 
b. Furthermore, all elements in TZ{xts) (each of which corresponds to a different instance) 
have the same set of A's, since Aj = —{cj\f)i + [{Qicb)^ {AbQi)~^ Aj^]i. Because of the 
symmetry of the ensemble under the reflections Qi, the probability of all instances in S is 
the same. 

All the vertices belonging to 7^(xg) have the same Aj's with the same probability, and 
exactly one is in the positive orthant. Thus, if Amin > 0, the latter vertex is the unique 
element from S which is the optimal vertex of an LP problem with a bounded solution. 
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Consequently, the probability of having any prescribed set of Aj's, and in particular, the 
probability distribution for the Aj's given Amj„ > 0, is not affected by the event that the 
LP instance has a bounded optimal solution (i.e., that the vertex is in the positive orthant). 
In other words, these are independent events. Integration over all instances and taking this 
way into account all possible sets S while imposing the requirement {Amin > A|Ajniii > 0} 
results in I 

The event Amin > corresponds to a specific partition of A into basic and non-basic 
sets B,Af, respectively. It turns out that it is much easier to analytically calculate the 
probability distribution of A^m for a given partition of the matrix A. It will be shown 
in what follows that in the probabilistic model we defined, 'P(Ajnin > ^l^min > 0) is 
proportional to the probability that Amin > A for a fixed partition. Let Wj be the event 
that a partition j of the matrix A is an optimal partition, i.e. all Aj are positive (j is an 
index with range 1, . . . , (^)). Let the index 1 stand for the partition where B is taken from 
the last m columns of A. We now show: 

Lemma 3.2 Let A > then 

> 0) =V{A^in > A\Wi) . 

Proof. Given that Amin > 0, there is a unique optimal partition since a non-unique optimal 
partition occurs only if c is orthogonal to some face of the polyhedron, in which case Aj = 
for some i. Thus we can write: 

V{A^in> A\A^in>0) = 5]P(A„,„> A|A„i„>0,l^_,)P(I^j) (20) 

j 

= Y.'P(^rrnn> A\Wj)V{Wj) , (21) 

j 

where the second equality holds sinc6 the event is contctined. in the event that ^^nin 

> 0. 

The probability distribution of {A, c) is invariant under permutations of columns of A and 
c, and under permutations of rows of A. Therefore the probabilities V{Wj) are all equal, 
and so are V{Amin > A\Amin > 0, Wj), and the result follows. 

■ 

We define 

Amini = min{Ai | Aj are computed relative to the partition 1} (22) 

Note that the definition of Amin in equation is relative to the optimal partition. To 
show that all computations can be carried out for a fixed partition of A we need the next 
lemma: 



Lemma 3.3 Let A > then 

ViAminl > A) 



> A|Amin > 0) 



V{Arrnnl > 0) 
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Proof. The result follows from 



> 0), (23) 

combined with the result of the previous lemma and the definition of conditional probability. 

I 

In view of the symmetry of the joint probability distribution (j.p.d.) of Ai, . . . , A„_m, 
given by and 1)32(1 . the normalization constant V{Amini > 0) satisfies: 

P(A™„i > 0) = 1/2"-™ . (24) 

Remark 3.1 Note that we are assuming throughout this work, that the optimal vertex is 
unique, i.e., given a partition {M,I3) of A that corresponds to an optimal vertex, the basic 
components are all non-zero. The reason is that if one of the components of the optimal 
vertex vanishes, all of its permutations with the n — m components of the non-basic set 
result in the same value of c^x. Vanishing of one of the components of the optimal vertex 
requires that 6 is a linear combination of columns of A, that is an event of zero measure in 
our probabilistic ensemble. Therefore this case will not be considered in the present work. 



4 Computing the distributions of A^^^^i and of A 



mm 



In the following we compute first the distribution of Amini and use it to obtain the distri- 
bution of Amin via Lemma ()3.3() . We denote the first n — m components of c by y, and its 
last m components by z. In this notation equation Q for Aj takes the form: 



Ap = -yp + {z^B '^N)p p = l,...,n-m. 



Our notation will be such that indices 

i,j,k,... range over 

and 



1,2, 



, m 



p,q, . . . range over 1, 2, . . . , n — m. 



In this notation, the ensembles (fT3|) and (fT5|) may be written as 



f{A) = f(N,B) = —exp 



/(c) = /(y,2;) = — exp 



(25) 



, A. 



We first compute the joint probability distribution (j.p.d.) of Ai, . . . , ^„_m 
the partition 1. This is denoted by /i(Ai, . . . , A„_m)- Using (23), we write 



(26) 



relative to 



/l(Ai, . . . , An-ra) 



^ ^m{n-m) f^^ra^^n-vay 



fiN,B)f{y,z) II 5\Ag + yg- z,{B-^)^iNi, , (27) 



9=1 
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where 5{x) is the Dirac delta function. We note that this j.p.d. is not only completely 
symmetric under permuting the Ap's, but is also independent of the partition relative to 
which it is computed. 

We would like now to perform the integrals in H27|) and obtain a more explicit expression 
for /i(Ai, . . . , An-m)- It turns out that direct integration over the y^'s, using the 6 function, 
is not the most efficient way to proceed. Instead, we represent each of the 6 functions as a 
Fourier integral. Thus, 



/i(Ai,...,A, 



jn—m\ 

Bd'^^''-"''^Nd"'zd''-"'y f{N,B)f{y,z) 



■ exp 



(27r)" 

Ag I Ag + - ^ Zj{B-^)jiN, 



Integration over N^p, Xg and Up is straight forward and yields 



/i(Ai,...,A„_ 



■ exp 



V A2 

zT{BTB)-^z+l 



(28) 



Here the complete symmetry of /i(Ai, . . . , A„_m) under permutations of the Ap's is ex- 
plicit, since it is a function of Yl,p ^p- 

The integrand in (|28() contains the combination 



u{B,z) 



zT(BTB)~^z + l' 



(29) 



Obviously, < u{B,z) < 1. It will turn out to be very useful to consider the distribution 
function P{u) of the random variable u = u{B, z), namely. 



P{u) 



27rcr2 



5 u 



zT(BTB)~^z + l 



■ (30) 



Note from (|29|) that u{XB,Xz) = u{B,z). Thus, in fact, P{u) is independent of the 
(common) variance a of the Gaussian variables B and z, and we might as well rewrite ()3U() 
as 



P{u) 



m -\-m 
2 



1 



z^iB^By^z + lJ ' 



(31) 



with A > an arbitrary parameter. 

Thus, if we could calculate P{u) explicitly, we would be able to express the j.p.d. 
/i(Ai, . . . , An-m) in (|28j) in terms of the one dimensional integral 



/i(Ai,...,A, 



1 



27rfT" 



n — m CXD 
2 



ri. — ffi 

du Piu) u 2 exp 



1 

'2^ 



p=i 



(32) 



as can be seen by comparing (|28j) and (|3U|1. 
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In this paper we are interested mainly in the minimal A. Thus, we need fminii^), the 
probability density of Amini- Due to the symmetry of /i(Ai, . . . , A„,_m)) which is explicit 
in ((SH), we can express /mini (A) simply as 

oo 

/i(A,A2,..., (33) 

A 

It will be more convenient to consider the complementary cumulative distribution (c.c.d.) 

oo 

Q(A) = V{Aminl >^) = J fminl{u)du , (34) 

A 

in terms of which 

/™„i(A) = -^Q(A) . (35) 
The c.c.d. Q(A) may be expressed as a symmetric integral 

oo 

Q(A) = I dAi ...dA /i(Ai,A2,...,A (36) 

A 

over the A's, and thus, it is computationally a more convenient object to consider than 

fminl (A) . 

From (jS^ and (jS^ we obtain that 

n-m OO / OO \ n-m 

Q(A)= (^^) ' I duP{u) I d^e-5^™M , (37) 

\ A / 

and from (|37)1 one readily finds that 

2(0) = ^ ' (38) 

(as well as Q(— oo) = 1, by definition of Q). 
Then, use of the integral representation 

oo 

1 - erf (x) = erfc(x) = ^ dv e"''" , (x > 0) (39) 

X 

and (jSHl) leads (for A > 0) to 

OO 

Q(A) = Q(0) J du P{u) [eric [A ] j ■ (40) 



This expression is an exact integral representation of Q(A) (in terms of the yet undetermined 
probability distribution P{u)). 
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In order to proceed, we have to determine P{u). Determining P{u) for any pair of 
integers (n, m) in pij) in a closed form is a difficult task. However, since we are interested 
mainly in the asymptotic behavior of computation times, we will contend ourselves in 
analyzing the behavior of P{u) as n,m —> oo, with 



r = m/n < 1 



(41) 



held fixed. 

We were able to determine the large n, m behavior of P{u) (and thus of /i(Ai, A2, . . . , A„. 
and Q(A)) using standard methods |23[ I24j of random matrix theory \25\ . 

This calculation is presented in detail in Appendix B. We show there (see Eq. H125() ) 
that the leading asymptotic behavior of P{u) is 



P{u) 



Tfl vnu 

e 2 

2?™ 



(42) 



namely, ^/u is simply a Gaussian variable, with variance proportional to 1/ \/m. Note that 
(|42|) is independent of the width a, which is consistent with the remark preceding (|31|). 
Substituting in ((211), we obtain, with the help of the integral representation 



T{z) = J t'-^e-'dt 


of the r function, the large n, m behavior of the j.p.d. /i(Ai, . . . , A„- 

'n — m+l\/l 1 



(43) 



/i(Ai 



maV 



as 



2 



^vr mcr2 + Ep A2^ 



(44) 



Thus, the A's follow asymptoticly a multi-dimensional Cauchy distribution. It can be 
checked that 1)44(1 is properly normalized to 1. 

Similarly, by substituting (|42l) in (|4U|) . and changing the variable to y = \/mu/2, we 
obtain the large n, m behavior of Q(A) as 



Q(A) = ^ / H 




A 



ma 



(45) 



As a consistency check of our large n, m asymptotic expressions, we have verified, with the 
help of (jini), that substituting (gH) into leads to (jlOl), with P(ti) there given by the 
asymptotic expression 

We are interested in the scaling behavior of Q(A) in l|45() in the limit n, m ^ 00. In this 

-m 

in (|45|) decays rapidly to zero. Thus, the 



large n, m limit, the factor {evic 



A 



y/rao 



integral in (|45|) will be appreciably different from zero only in a small region around A = 0, 



where the erfc function is very close to 1. More precisely, using erfcx = 1 — + £>( 



we may expand the erfc term in ()45() as 

V 



erfc 



A 



mcr 



2yA 



+ • 



(46) 



13 



(due to the Gaussian damping factor in H45|) . this expansion is uniform in y). Thus, we see 
that Q(A)/Q(0) win be appreciably different from zero only for values of A/o" of the order 
up to l/^/m, for which exponentiates into a quantity of and thus 

oo 

Q(A) ^ J dye-y' exp ("^ " l) , (47) 



where 

5 = ^ (48) 
a 

is 0{rr^\ Note that m/n is kept finite and fixed. The integral in H47|) can be done, and 
thus we arrive at the explicit scaling behavior of the c.c.d. 

Q(A) = Q(0) e^i erfc(xA) , (49) 

where 

a^A = riA{n,m)A, (50) 

with 



r/A(n,m) = ^f^-l) ^. (51) 

The c.c.d. Q(A) depends, in principle, on all the three variables n, m and A. The result 
H49|) demonstrates, that in the limit {n,m) — > oo (with r = m/n held finite and fixed), 
Q(A) is a function only of one scaling variable: the xa defined in (|5fl|) . 

We have compared (|49jl and (|5U|) against results of numerical simulations, for various 
values of n/m. The results are shown in Figures 2 and 3 in Section 8. 

Establishing the explicit scaling expression of the probability distribution of the conver- 
gence rate constitutes the main result in our paper, which we summarize by the following 
Theorem: 

Theorem 4.1 Assume that LP problems of the form (0), with the instances distributed 
according to U,'^) - I[T^) . are solved by the Faybusovich algorithm Then, in the asymptotic 
limit n oo, m ^ oo with < r = m/n < 1 kept fixed, the convergence rate Amin defined 
by a 11]^ is distributed according to 

2 

V{Amin > A\bounded optimal solution) = e^^ erfc(xA) , (52) 
where xa is given by \5Uil . 

Proof. Q(A) = V{Amini > A) by (El). Therefore, use of ((SH) and (|SH1), namely, 

V{A^,ini > 0) = Q(0) = 1/2— 

, and of implies 

V{Am > A) = e^i erfc(a;A) , (53) 

but according to Lemma (|3.3|) . 

P(A„i„i > A) 



P(Amin > A|A„,i„ > 0) 



V{Arainl > 0) 
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Finally, substituting and pi]) in the last equation, and use of Lemma leads to 

the statement of the theorem. I 



From (|49l) and (|5U|) . we can obtain the probability density /mini (A) of Amini, using 
H35|) . In particular, we find 

/™ni(0) = ^ f--l)Q(O), (54) 

which coincides with the expression one obtains for /mmi(O) by directly substituting the 
large {n,m) expression into without first going to the scaling regime A ~ l/^/rn, 
where (|49j) holds. 



5 High-probability behavior 

In this paper we show that the Faybusovich vector field performs well with high probability, 
a term that is explained in what follows. Such an analysis was carried out for interior point 
methods e.g. in |2H I26j. When the inputs of an algorithm have a probability distribution, 
Ajjiin becomes a random variable. High probability behavior is defined as follows: 

Definition 5.1 Let T„ be a random variable associated with problems of size n. We say 
that T(n) is a high probability bound on r„ if for n — > cxd T„ < T(n) with probability one. 

To show that 1/Amin < ri{m) with high probability is the same as showing that A^in > 
l/rj{m) with high probability. Let /j^^(A|Amin > 0) denote the probability density of Amin 
given Ajnin > 0. The m superscript is a mnemonic for its dependence on the problem size. 
We make the following observation: 

Lemma 5.1 Let V{A^[^ > x|Amin > 0) be analytic in x around x = 0. Then, Amin > 
/j^;^ (0|Amin > 0)g{m) with high-probability, where g{m) is any function such that 
im^-^oo 5'(m-) = oo. 

Proof. For very small x we have: 

V{A^in > x|A^i„ > 0) « 1 - /2(0|A^i„ > 0)x . (55) 

We look for x = x{m) such that 'P(Amin > x(m)|Amin > 0) = 1 with high probability. For 
this it is sufficient that 

lim /2(0|A^in > O)x(m) = (56) 

This holds if 

= [/iTn (Ol^min > 0)9{m)] , (57) 
where g{m) is any function such that limm_^oo 5'(^7^) = oo. I 

The growth of g{m) can be arbitrarily slow, so from this point on we will ignore this factor. 
As a corollary to Theorem 1)4. 1|) and H54|) we now obtain: 
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Corollary 5.1 Let {A,b,c) be linear programming instances distributed according to p2() 
then 

-J— = 0{m^/^) and T, = 0{m^/^) (58) 

^min 

with high probability. 

Proof. According to the results of Section 4, (and more explicitly, from the derivation of 
in Section 7), /i7n(0| ^min > 0) ~ m^", and the result follows from lemma (|5.1|) and 
the definition of (equation ((TU)) ). I 

Remark 5.1 Note that bounds obtained in this method are tight, since they are based on 
the actual distribution of the data. 

Remark 5.2 Note that /^n(0|Ainin > 0) / 0. Therefore, the moment of the probabil- 
ity density function /j^^(A|Ainin > 0) does not exist. 



6 Measures of complexity in the non-asymptotic regime 

In some situations one wants to identify the optimal vertex with limited precision. 
The term 

A(t) = -f:«,,log (59) 

in 0, when it is positive, is a kind of "barrier": Ait in equation © must be larger than 
the barrier before Xi can decrease to zero. 

In this section we discuss heuristically the behavior of the barrier /5j (t) as the dynamical 
system flows to the optimal vertex. To this end, we first discuss in rhe following sub-section 
some relevant probabilistic properties of the vertices of polyheders in our ensemble. 



6.1 The typical magnitude of the coordinates of vertices 

The flow © conserves the constraint Ax = b in Let us split these equations according 
to the basic and non-basic sets which corresponding to an arbitrary vertex as 

Abxb + Aj^xj^ = b . (60) 

Precisely at the vertex in question xj^f = 0, of course. However, we may be interested in 
the vicinity of that vertex, and thus leave xj^f arbitrary at this point. 

We may consider H60() as a system of equations in the unknowns xq with parameters 
xj\f, with coefficients Aq^Aj^ and b drawn from the equivariant gaussian ensembles (|13() . 
(|TH) . (|17|) and (fTH|) . Thus, the components of xg (e.g., the Xj-)-fi,_}Tj(t) 's in if we are 
considering the optimal vertex) are random variables. The joint probability density for the 
m random variables x^ is given by Theorem 4.2 of [2ZI (applied to the particular gaussian 
ensembles (O, (d), ^ and ^) as 

r (^) X 
P{xb; xj^) = ^ ^ , (61) 
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where 

X = ^l + x^. (62) 

(Strictly speaking, we should constrain xq to lie in the positive orthant, and thus multiply 
(|6H) by a factor 2™ to keep it normalized. However, since these details do not affect our 
discussion below, we avoid introducing them below.) 

It follows from (|HT|) that the components of are identically distributed, with proba- 
bility density of any one of the components x^j = C, given by 

p{^-^^m) = \^^^ (63) 

in accordance with a general theorem due to Girko j28j . 

The main object of the discussion in this sub-section is to estimate the typical magnitude 
of the m components of xq. One could argue that typically all m components \xBj\ < A, 
since the Cauchy distribution (|63|) has width A. However, from (|63p we have that Prob(|C| > 
A) = 1/2, namely, \xBj\ < A and \xisj\ > A occur with equal probability. Thus, one has to 

be more careful, and the answer lies in the probability density function for R = yx^x^. 



From (|61|) . we find that the probability density function for R = Jx^xb takes the form 



For a finite fixed value of m, this expression vanishes as (i?/A)'"~^ for R « A, attains its 
maximum at 

aJ 

and then and decays like X/R^ for R » A. Thus, like the even Cauchy distribution (|63|1 . 
it does not have a second moment. 

In order to make (|64|) more transparent, we introduce the angle 9 defined by 

tane{R) = ^, (66) 
A 

where < 9 < tt/2. In terms of 9 we have 

n {\xb\ =R) = ^ -^j^ -t cos2 9 sin— 1 9 . (67) 

(In order to obtain the probability density for 9 we have to multiply the latter expression 
by a factor dR/d9 = A/ cos^ 9.) 

Let us now concentrate on the asymptotic behavior of (|67)) (or (|64|) ') in the limit m ^ oo. 
Using Stirling's formula 

r(x) ~ i/^x^e-^ (68) 
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for the large x asymptotic behavior of the Gamma functions, we obtain for m — > oo 



/ 2m 

U{\xb\ = R) ^ cos'^esm"'-^. (69) 

V vrA 

Clearly, H69|) is exponentially small in m, unless sin^ ~ 1, which implies 

9 = 7t/2-6 (70) 

with 5 ~ Thus, writing 

*=\/^ (Tl) 

V m 

(with u « m), we obtain, for m oo. 



n{\xB\ = R)^\ 72 «e-". (72) 

In this regime 

f=tan«.;f»l. (73) 



The function on the r.h.s. of H72() has its maximum at n = 1, i.e., at R/X = yjrnj^ (in 
accordance with ()65() ) and has width of 0(\) around that maximum. However, this is 
not enough to deduce the typical behavior of i?/A, since as we have already commented 
following H65|) . n(|xg| = i?) has long tails and decays like A/i?^ past its maximum. Thus, 
we have to calculate the probability that R > Rq = Atan^O; given Rq. The calculation is 
straight forward: using and H66() we obtain 

TT 

OO 2 

Prob(i? >Ro) = J n{R) dR = J sii^-^ 9 . (74) 

Ro 00 

Due to the fact that in the limit m ^ oo, sin™~^ 6 may be approximated by a gaussian 
centered around 9 = Tr/2 with variance 1/m, it is clear that 

Prob(i? > Ro) = Prob(6l > 6'o) =^ 1 , 

unless 6o = IT /2 — 9q y/2uo/m, with uq << m. Thus, using (|7()|) and (|7T|) we obtain 

Prob(iZ >Ro) = J— f cos'"-^ 5 ~ ^ / e-" = erf(V^^) . (75) 



(76) 





Finally, using the definitions of uq, and i?0; we rewrite (|75]) as 



Prob(i? > Ro) = erf 



fm ( A 

— arctan — 
2 Vi?o 



From the asymptotic behavior erf (x) ~ 1 — e ^ jx^JH at large x, we see that Prob(i2 > Rq) 
saturates at 1 exponentially fast as i?o decreases. Consequently, 1 — Prob(i? > -Ro) ~ 
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0{m^) is not negligible only if Rq/X is large enough, namely, y ^ ^^^^^^\Ti^) — 
Rq/X > \fraj2. If R^jX is very large, namely, R^jX » y/rn/2, which corresponds to 
a small argument of the error function in (|76|) . where we clearly have Prob(i? > Rq) ~ 
yj2rripK{X/ Rq) « 1. From these properties of ((7H|) it thus follows that typically 

^ ~ O(V^) . (77) 

Up to this point, we have left the parameters xj^ unspecified. At this point we select 
the prescribed vertex of the polyheder. At the vertex itself, xj^f = 0. Therefore, from H62|l . 
we see that A = 1. Thus, according to (|77|) . at the vertex, typically 

i^vertex ~ O(V^) ■ (78) 

This result obviously holds for any vertex of the polyheder: any partition (|6fl|) of the system 
of equations Ax = h into basic and non-basic sets leads to the same distribution function 
(|HT|) . and at each vertex we have xj^ = 0. 

Thus, clearly, this means that the whole polyheder is typically bounded inside an n- 
dimensional sphere of radius R ~ 0{^/m) centered at the origin. 

Thus, from (|78|) and from the rotational symmetry of (|61|) . we conclude that any com- 
ponent of xq at the optimal vertex, or at any other vertex (with its appropriate basic set 
B), is typically of 0(-Rvertex/\/"^) = ^(1) (and of course, positive). Points on the poly- 
heder other than the vertices are weighted linear combinations of the vertices with positive 
weights which are smaller than unity, and as such also have their individual components 
typically of 0(1). 



6.2 Non-asymptotic complexity measures from /3j 

Applying the results of the previous subsection to the optimal vertex, we expect the compo- 
nents of xg(t) (i.e., the Xj-|_„-m(i)'s in H59() 'l to be typically of the same order of magnitude 
as their asymptotic values \\m.t—KX)Xis{t) at the optimal vertex, and as a result, we expect 
the barrier (3i{t) to be of the same order of magnitude as its asymptotic value limf^oo (ii{t)- 
Note that, for this reason, in order to determine how the Xi{t) in (jH)) tend to zero, to 
leading order, we can safely replace all the Xj+n-m{t) by their asymptotic values in x*^. 
Thus, in the following we approximate the barrier (j59|) by its asymptotic value 

m m 

Pi = - lim y^ajilogxj+n-mit) = - V Oji logx* , , (79) 

where we have also ignored the contribution of the initial condition. 

We now consider the convergence time of the solution x(t) of © to the optimal vertex. In 
order for x{t) to be close to the maximum vertex we must have Xi{t) < e for i = 1, . . . ,n — m 
for some small positive e. The time parameter t must then satisfy: 

exp(— Ajt + Pi) < e , for i = 1, . . . , n — m. (80) 

Solving for t, we find an estimate for the time required to flow to the vicinity of the optimal 
vertex as 

t> ^ + ^^ ,for alH = l,...,m. (81) 

^i ^i 



19 



We define 

which we consider as the computation time. We denote 

/?max = max (3i . (83) 

In the limit of asymptoticahy small e, the first term in ()82() is irrelevant, and the 
distribution of computation times is determined by the distribution of the Aj's stated by 
Theorem (|I?T|) . 

If the asymptotic precision is not required, the first term in (|82|1 may be dominant. 
To bound this term in the expression for the computation time we can use the quotient 
/3max/Amin, where Amin IS defined in (HTJ)- 

In the probabilistic ensemble used in this work Pmax and Pmax / ^min are random vari- 
ables, as is Ajyjj^. Unfortunately, we could not find the probability distributions of Pmax 
and Pmax/^min analytically as we did for A^m- In the following section, a conjecture 
concerning these distributions, based on numerical evidence, will be formulated. 



7 Scaling functions 

In Section 4 it was shown that in the limit of large (n, m) the probability V{Amin > 
A|A™„ > 0) is given by Consequently, P(A^i„ < A|A„,„ > 0) = ^(">™)(A) is of 

the scaling form 

^("-™)(A) = l-e^Aerfc(2;A) =Hxa). (84) 

Such a scaling form is very useful and informative, as we will demonstrate in what follows. 
The scaling function contains all asymptotic information on A. In particular, one can 
extract the problem size dependence of f^^{0\Amin > 0) which is required for obtaining 
a high probability bound using Lemma 15.11 (This has already been shown in Corollary 
1)5. We use the scaling form, equation (jHU, leading to, 

/it(0|A™„ > 0) = ^:^^^^|a=o = r^A{n,m)^^\x.=o. (85) 

This is just /mmi(0)/Q(0). With the help of lemma I^TTl leading to and our finding 
that T/(n, m) ~ ^/rn, we conclude that with high probability 

^ = O(V^). (86) 



A 



min 



The next observation is that the distribution J-'{x/^) is very wide. For large xa it behaves 

as 1 — , as is clear from the asymptotic behavior of the erfc function. Therefore it does 

not have a mean. Since at xa = the slope dJ^/dxA\x^=o does not vanish, also I/xa does 
not have a mean (see Remark (|5.2|) ). 

We would like to derive scaling functions like ()84() also for the barrier Pmax, that is the 
maximum of the Pi defined by H79|) and for the computation time T defined by (|82() . The 
analytic derivation of such scaling functions is difficult and therefore left for further studies. 
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Their existence is verified numerically in the next section. In particular for fixed r = m/n, 
we found that 

V 



and 



where fdmax and T are the maximal barrier and computation time. The scaling variables 
are ^ 

x/3 = V/3{n,m)- (89) 

and ^ 

XT = r]T{n,m)j. (90) 

The asymptotic behavior of the scaling variables was determined numerically to be 

r]f3{n, m) m (91) 

and 

r]T{n,m) ~ mlogm. (92) 

This was found for constant r. The precise r dependence could not be determined nu- 
merically. The resulting high probability behavior for the barrier and computation time is 
therefore: 



/3max = 0{m), T = 0{m log m) (93) 

Note that scaling functions, such as these, immediately provide the average behavior as 
well (if it exists) 

Here, in the calculation of the distribution of computation times it was assumed that 
these are dominated by the barriers rather then by | loge| in The results (|S7j) . (|5S)) 

and (|93]) are conjectures supported by the numerical calculations of the next section. 



8 Numerical simulations 

In this section the results of numerical simulations for the distributions of LP problems are 
presented. For this purpose we generated full LP instances {A, b, c) with the distribution 
(|12|) . For each instance the LP problem was solved using the linear programming solver of 
the IMSL C library. Only instances with a bounded optimal solution were kept, and A^ain 
was computed relative to the optimal partition and optimality was verified by checking that 
■^min > 0. Using the sampled instances we obtain an estimate of jr("'"^)(A) = V{Amin < 
A\Amin > 0), and of the corresponding cumulative distribution functions of the barrier 
Pma.x and the computation time. 

As a consistency verification of the calculations we compared V{Amin < A\Amin > 0), 
to V{Amini < A\Amini > 0) that was directly estimated from the distribution of matrices. 
For this purpose we generated a sample of A and c according to the probability distributions 
(|13I15() with (7 = 1 and computed for each instance the value of A^mi (the minimum over 
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Figure 1: Comparison of V{Amini < A|A,„mi > 0) and V{A^i^ < AjAmin > 0) for 
m = 2, n = 4. 
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A 



Figure 2: 

jr{n,m)^^-^ for m = 4,20,40,80,120, n = 2m. The number of instances was 
10^, 10^,40000, 15000,5800 respectively. There is very good agreement with the analytical 
results, improving as m increases. 
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Figure 3: T{xa.) is plotted against the variable xa, for the same data as Figure 2. 
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Figure 4: as a function of the variable = m/ f3max for the same instances as 

Figure 2. 
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Figure 5: J^i/t{xt) as a function of the variable xt = mlogm/T for the same instances 
as Figure 2. 

Aj) for a fixed partition of A into (A^, B). We kept only the positive values (note that the 
definition of A^ini does not require b). The two distributions are compared in Figure 1, 
with excellent agreement. 

Note that estimation of V{Amini < A|Ammi > 0) by sampling from a fixed partition 
is infeasible for large m and n, since for any partition of A the probability that Ammi 
is positive is 2^^"^™^ (equation (HU). Therefore the equivalence between the probability 
distributions of Amin and A^mi cannot be exploited for producing numerical estimates of 
the probability distribution of Amin- Thus we proceed by generating full LP instances, and 
solving the LP problem as described above. 

The problem size dependence was explored while keeping the ratio n/m fixed or while 
keeping m fixed and varying n. In Figure 2 we plot the numerical estimates of ^("'™)(A) 
for varying problem sizes with n/m = 2 and compare it with the analytical result, Equation 
(|84j) . The agreement with the analytical result improves as m is increased, since it is an 
asymptotic result. The simulations show that the asymptotic result holds well even for 
m = 20. As in the analytical result, in the large m limit we observe that ^("■'"^)(A) is not 
a general function of n, m and A, but a scaling function of the form ^("■''")(A) = J-{xi^) 
as predicted theoretically in Section 7 (see 1)84(1 there). The scaling variable x^{ra) is given 
by (|Hn|) . Indeed, Figure 3 demonstrates that ^("■'™) has this form as predicted by Equation 
(|84() with the scaling variable x^. 

For the cumulative distribution functions of the barrier /3max and of the computation 
time T we do not have analytical formulas. These distribution functions are denoted by 
•^i/zs"^ and J-y^^ respectively. Their behavior near zero enables to obtain high proba- 
bility bounds on /3max and T, since for this purpose we need to bound the tails of their 
distributions, or alternatively, estimate the density of l//3max and l/T at 0. In the numer- 
ical estimate of the barrier we collected only positive values, since only these contribute to 
prolonging the computation time. From Figure 4 we find that J-i^'I^^ is indeed a scaling 

J-/Pmax 

function of the form ()87|) with the scaling variable xp of (|89|) . The behavior of the compu- 
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tation time is extracted from Figure 5. The cumulative function is found to be a 

scaling function of the form (|88|) with the scaling variable xt of (|9()|) . The scaling variables 
Xj3 and XT were found numerically by the requirement that in the asymptotic limit the 
cumulative distribution approaches a scaling form. Such a fitting is possible only if a 
scaling form exists. We were unable to determine the dependence of the scaling variables 
X/3 and XT on n/m. 

9 Summary and discussion 

In this paper we computed the problem size dependence of the distributions of parameters 
that govern the convergence of a differential equation (Eq.©) that solves the linear pro- 
gramming problem [S]. To the best of our knowledge, this is the first time such distributions 
are computed. In particular, knowledge of the distribution functions enables to obtain the 
high probability behavior (for example (jHHI)) and H93() ). and the moments (if these exist). 

The main result of the present work is that the distribution functions of the convergence 
rate, Amm, the barrier Pmax and the computation time T are scaling functions; i.e., in the 
asymptotic limit of large {n,m), each depends on the problem size only through a scaling 
variable. These functions are presented in section 7. 

The scaling functions obtained here provide all the relevant information about the dis- 
tribution in the large (n, m) limit. Such functions, even if known only numerically, can be 
useful for the understanding of the behavior for large values of (n, m) that are beyond the 
limits of numerical simulations. In particular, the distribution function of A^in was calcu- 
lated analytically and stated as Theorem 1)4.1(1 . The relevance of the asymptotic theorem 
for finite and relatively small problem sizes (n, m) was demonstrated numerically. It turns 
out to be a very simple function (see (|84() ). The scaling form of the distributions of Pmax 
and of T was conjectured on the basis of numerical simulations. 

The Faybusovich flow 6 that is studied in the present work, is defined by a system of 
differential equations, and it can be considered as an example of the analysis of convergence 
to fixed points for differential equations. One should note, however, that the present system 
has a formal solution (jHJ, and therefore it is not typical. 

If we require knowledge of the attractive fixed points with arbitrarily high precision (i.e., 
e of ()8U|) and ()82() can be made arbitrarily small), the convergence time to an e- vicinity of 
the fixed point is dominated by the convergence rate Amin- The barrier, that describes the 
state space "landscape" on the way to fixed points, is irrelevant in this case. Thus, in this 
limit, the complexity is determined by (|86|1 . This point of view is taken in [12 . 

However, for the solution of some problems (like the one studied in the present work), 
such high precision is usually not required, and also the non-asymptotic behavior (in e) 
of the vector field, as represented by the barrier, has an important contribution to the 
complexity of computing the fixed point. 

For computational models defined on the real numbers, worst case behavior can be ill 
defined and lead to infinite computation times, in particular for interior point methods for 
linear programming jl.Sj . Therefore, we compute the distribution of computation times for 
a probabilistic model of linear programming instances rather then an upper bound. Such 
probabilistic models can be useful in giving a general picture also for traditional discrete 
problem solving, where the continuum theory can be viewed as an approximation. 
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A question of fundamental importance is how general is the existence of scaling distribu- 
tions. Their existence would be analogous to the central limit theorem and to scaling in 
critical phenomena PU] and in Anderson localization |311 132| . Typically such functions are 
universal. In the case of the central limit theorem, for example, under some very general 
conditions one obtains a Gaussian distribution, irrespectively of the original probability 
distributions. Moreover it depends on the random variable and the number of the original 
variables via a specific combination. The Gaussian distribution is a well known example 
of the so-called stable probability distributions. In the physical problems mentioned above 
scaling and universality reflect the fact that the systems becomes scale invariant. 

A specific challenging problem still left unsolved in the present work is the rigorous 
calculation of the distributions of l/Pmax and of 1/T, that is proving the conjectures con- 
cerning these distributions. This will be attempted in the near future. 

A The Faybusovich vector field 

In the following we consider the inner product < ^,rj >x-i= S.'^^^^V^ This inner product 
is defined on the positive orthant IR" = {x £ JRJ^ : Xj > 0, i = 1, . . . ,n}, where it defines 
a Riemannian metric. In the following we denote by a*, i = l,...,m the rows of A. 
The Faybusovich vector field is the gradient of h relative to this metric projected to the 
constraint set 0. It can be expressed as: 

m 

gmdh = Xc-Y,Cii^)^o,\ (94) 

i=l 

where Ci{x) , . . . , Cm{x) make the gradient perpendicular to the constraint vectors, i.e. 
A grad h = 0, so that Ax = 6 is maintained by the dynamics. The resulting fiow is 

fix 

— = F(x) = giadh (95) 
at 

Consider the functions 

m 

■^i{x) = log(xi) + ^aji\og{xj+n-m) i = l...n-m. (96) 
i=i 

The are defined such that their equations of motion are easily integrated. This gives 
n — m equations which correspond to the n — m independent variables of the LP problem. 
To compute the time derivative of we first find: 

1 m 

V^i = —e' + J2 e^'+"-"^ , (97) 

and note that the vectors /i* defined in equation © have the following property: 
< >= 0, i = 1, . . . ,n — m, j = 1, . . . ,m 
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Therefore: 

^iix) = <V^i{x),x >=<V^i{x),gra.dh > (98) 

m 

= <fi\c>=-Ai 
This equation is integrated to yield: 

x,{t) = x,(0) exp (-Ait - f; a,, log ^i+I^zlM] . (99) 

\ j=i Xj-^-n-m[0) I 

B The probability distribution P{u) 

In this Appendix we study the probability distribution function 



zT{BTB)-^z + 1J ' 

defined in ()3U|) and H31|) and calculate it in detail explicitly, in the large n, m limit. 
We will reconstruct P{u) from its moments. The A^-th moment 



k]\f = / duP{u) u 



N 





of P{u) may be conveniently represented as 

m?+m CO 



I ^,.r. , " / B (101) 

^"^'^ > f^{^ 

where in the last step we have performed Gaussian integration over z. 

Recall that P{u) is independent of the arbitrary parameter A (see the remark preceding 
(|HH) ). Thus, its A^-th moment /ctv niust also be independent of A, which is manifest in 
()1U1|) . Therefore, with no loss of generality, and for later convenience, we will henceforth 
set X = m (since we have in mind taking the large m limit). Thus, 



ydet(l + -^ 



r(Ar)7 J .[ZJTT^ 

CO 



T{N) J ^ \m 



t''-'e-'i;(-] dt, (102) 
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where we have introduced the function 



vr 



m = [-] <r B , (103) 



det (l + 



y 



Note that 

V'(O) = 1 . (104) 
The function ipiv) is weU-defined for y >0, where it clearly decreases monotonically 

V^'(y)<0. (105) 

We would like now to integrate over the rotational degrees of freedom in dB. Any real 
m X m matrix B may be decomposed as |24| I25j 

B = Oln02 (106) 

where 0x^2 £ 0{m), the group of m x m orthogonal matrices, and Q = Diag(ix'i, . . . ,ujm), 
where wi, . . . , Urn are the singular values of B. Under this decomposition we may write the 
measure dB as |24| I25j 

dB = dfM{0i)dfi{02) n - , (107) 

i<j 

where dfi{Oi^2) are Haar measures over the appropriate group manifolds. The measure dB 
is manifestly invariant under actions of the orthogonal group 0{m) 

dB = d{BO) = d{0'B) , 0,0' e 0{m) , (108) 

as should have been expected to begin with. 

Remark B.l Note that the decomposition ()1U6() is not unique, since OiP and 1^02, with 
D being any of the 2*" diagonal matrices Diag (±1, • • • , ±1), is an equally good pair of 
orthogonal matrices to be used in H106|) . Thus, as Oi and O2 sweep independently over 
the group 0{m), the measure (|107|) over counts B matrices. This problem can be easily 
rectified by appropriately normalizing the volume Vm = / dn{0i)dn{02)- One can show 
that the correct normalization of the volume is 

m(m+l) 

Vm = N / A • (109) 

2^07=1 r(i + i)r(i) 

One simple way to establish (|109j) . is to calculate 

00 

dB exp --tvB'^B = (27r)'^ =Vm (F^uj J| - ^Jl exp - - . 
The last integral is a known Selberg type integral P5| . 
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The integrand in ()1U3|) depends on B only through the combination B^B = 02^'^02- 
Thus, the integrations over Oi and O2 in H1U3|) factor out trivially. Thus, we end up with 



„2 



m\ 2 



IT 



m =Vmi- ^"7 ' " - e-^'^'^^ . (110) 



det (1 + 



It is a straight forward exercise to check that (|in9jl is consistent with V'(O) = 1- 

Note that in deriving ()110|) we have made no approximations. Up to this point, all our 
considerations in this appendix were exact. We are interested in the large n, m asymptotic 
behavior^ of P{u) and of its moments. Thus, we will now evaluate the large m behavior 
of ip{y) (which is why we have chosen A = m in ()102() 1. This asymptotic behavior is 
determined by the saddle point dominating the integral over the m singular values coi in 
IjllOj) as m — > cxD. 

To obtain this asymptotic behavior we rewrite the integrand in (|110|1 as 



Met (1 + ^) 
where 

m -, 

S = mY.^f-^j:^ogiu;f-u;]f. (Ill) 

i=l i<j 

In physical terms, S is the energy (or the action) of the so-called "Dyson gas" of eigenvalues, 
familiar from the theory of random matrices. 

We look for a saddle point of the integral in (|11()|) in which all the coi are of 0(1). In 
such a case, S in (jlllj) is of 0{m?'), and thus overwhelms the factor 



/det(l + ^) 
where 



1 / \ 

/(y) = -^log 1 + 4 (112) 

is a quantity of 0{m^). For later use, note that 

1(0) = 0. (113) 

Thus, to leading order in 1/m, ip{y) is dominated by the well defined and stable saddle 
point of S, which is indeed the case. 

Simple arguments pertaining to the physics of the Dyson gas make it clear that the 
saddle point is stable: The "confining potential" term l|lllj^ tends to condense 

all the LJi at zero, while the "Coulomb repulsion" term — J2i<j ^og{u)f — lOj^ acts to keep 
the l^il apart. Equilibrium must be reached as a compromise, and it must be stable, 

^Recall that m and n tend to infinity with the ratio 141II . r — m/n, kept finite. 
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since the quadratic confining potential would eventually dominate the logarithmic repulsive 
interaction for uJi large enough. The saddle point equations 



dS ^ 
— — = 2uJi 

OUJi 



ni 



... ujf - Uj'j 



(114) 



are simply the equilibrium conditions between repulsive and attractive interactions, and 
thus determine the distribution of the \'jJi\. 

We will solve pi4() (using standard techniques of random matrix theory), and thus 
will determine the equilibrium configuration of the molecules of the Dyson gas in the next 
appendix, where we show that the m singular values uji condense (non uniformly) into the 
finite segment (see Eq. H141() ) 

< < 2 

(and thus with mean spacing of the order of 1 jm) . 

To summarize, in the large m limit, V'(y) is determined by the saddle point of the energy 
S (fTTTj) of the Dyson gas. Thus for large m, according to (HTUl), (fTTTj) and ((TT^ . 

™2 



' exp-(5. + ^/.(y)) , 



where is the extremal value of (|111|) . and /*(y) is (|112|) evaluated at that equilibrium 
configuration of the Dyson gas, namely, 

^(y) = -Ei°g 1 + 4 • (115) 

The actual value of 5* (a number of Oirn}^) is of no special interest to us here, since 
from (|lfl4|) and (|113j) we immediately deduce that in the large m limit 

^{y) ~ e^f . (116) 
Substituting (|116p back into I|1U2|) we thus obtain the large (n, m) behavior of as 

oo 

^-^m/ t^-^e--f^*(;^)dt. (117) 
The function /*(y) is evaluated in the next Appendix, and is given in Eq. H145() . 



h{y) = -y + \ly^ + 2y + log (^y + l + ^jy^ + 2y 

which we repeated here for convenience. 

The dominant contribution to the integral in H117() comes from values of t « m, since 
the function 

= i + T^* f-) , (118) 
2 \m J 
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which appears in the exponent in ()117() is monotonously increasing, as can be seen from 
H144|) . Thus, in this range of the variable t, using (|146() . we have 



Ht) = t+'^hi^] =2^2^+^ + Oi^] . (119) 



Note that the term t/2 in (|119|) is beyond the accuracy of our approximation for I^,. The 
reason is that in ()142() we used the continuum approximation to the density of singular 
values, which introduced errors of the orders of 1/m. Fortunately, this term is not required. 
The leading order term in the exponential (|119j) of l|117|) is just \/2mt. Consequently, in 
the leading order H117() reduces to 

oo oo 



(120) 







2r(2iV) (2A^-1)!! 



(2m)^r(A^) 

The moments H12U() satisfy Carleman's criterion |331 134j 

oo 

E ^'n'"'' = OO , (121) 

N=l 

which is sufficient to guarantee that these moments define a unique distribution P{u). 

Had we kept in 1)120(1 the 0{mP) piece of (|119|) . i.e., the term t/2, it would have produced 
a correction factor to (|120|1 of the form 1 + 0{N'^ /m). To see this, consider the integral 

oo oo 

J {2m)^T{N) J ^ ^ 



oo 

j y'^^-^e-y (l - y'^/Am + ■ ■ ) dy . 



T{N) J (2m)^r(iV) 

2 



(2m)^r(iV) 

u 

Thus, we can safely trust (|120|) for moments of order N « ^Jm. 

The expression in ()12())) is readily recognized as the 2iV-th moment of a Gaussian distri- 
bution defined on the positive half-line. Indeed, the moments of the Gaussian distribution 

ff(x; /i) = ^ e-'^''^' , x>0 (122) 



are 



(^') = ■ (123) 



IT fl^ 



In particular, the even moments of ()122|) are 



^i^+k) _ (2iV-l)!! 
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which coincide with (|12U() for = m. These are the moments of u = for the distribution 
P{u) satisfying P{u) du = g{x; ^Jrri/2)dx, as can be seen comparing ()120() and (|124() . 

Thus, we conclude that the leading asymptotic behavior of P{u) as m tends to infinity 

is 

the result quoted in 

As an additional check of this simple determination of P{u) from ()12U() . we now sketch 
how to derive it more formally from the function 

oo 

au)=[^. (120) 

J z — u 



known sometimes as the Stieltjes transform of P{u) G{z) is analytic in the complex 
z-plane, cut along the support of P{u) on the real axis. We can then determine P{u) from 
H126|) . once we have an explicit expression for G{z), using the identity 

P{u) = - Im G{u - ie) . (127) 

TT 

For z large and off the real axis, and if all the moments of P{u) exist, we can formally 
expand G{z) in inverse powers of z. Thus, 



N=0 i N=0 ^ 



For the /cat's given by ()12U|) . the series H128|) diverges. However, it is Borel summable [SSI- 
Borel resummation of (|12()|) . making use of 

1 ^ (2A^-1)!! /x^^ 

-1+ E 



yields 



Thus, 



G{z) = - / ; „, . (129) 



z J . /i _ Jl 

mz 



-ImG(u-ie) = — / = J e"" , 130 

TT ^ ^ TTU J , l2t_ - I y 27rU ' ^ ^ 



mu 
2 



which coincides with (IT2 
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C The saddle point distribution of the Ui 



We present in this Appendix the solution of the equihbrium condition H114|) of the Dyson 
gas of singular values 

|| = ".-E;^ = « (131) 

(which we repeated here for convenience), and then use it to calculate I*{y), defined in 
(|115|) . We follow standard methods j2,S| I24j of random matrix theory Let 

Si = 00^ , (132) 

and also define 

1 ™ 1 1 1 

F(u;) = -V( ) = -(tr -^), (133) 

^ ^ m^,w-Si w-B^B' ^ ' 

where w is a complex variable. Here the angular brackets denote averaging with respect to 
the B sector of ()13() . By definition, F{w) behaves asymptotically as 

F{w) — > - . (134) 

W—^00 W 

It is clear from (|133|) that for s > 0, e ^ 0+ we have 

Fis - ^e) = -P.P. E(73t) + ^ E('^(- " ^0) (135) 



1=1 * 1=1 



where P.P. stands for the principal part. Therefore (from (|133() '). the average eigenvalue 
density of B^B is given by 



1 1 



p(s) = — Y{6(s - Si)) = - Im F(s - ie) . (136) 



m ^ TT 

1=1 



In the large m limit, the real part of (|135|) is fixed by (|131|1 . namely, setting s = Sj, 

ReF(,-.).i(i:^) = l. (137) 

From the discussion of physical equilibrium of the Dyson gas (see the paragraph preced- 
ing (|114|) ). we expect the {sj} to be contained in a single finite segment < s < a, with a 
yet to be determined. This means that F{'w) should have a cut (along the real axis, where 
the eigenvalues of are found) connecting w = and a. Furthermore, p{s) must be 

integrable as s — > 0+, since a macroscopic number (i.e., a finite fraction of m) of eigenvalues 
cannot condense at s = 0, due to repulsion. These considerations, together with (|137j) lead 
[231 124j to the reasonable ansatz 



F{w) = 1 + 



^w{w-a), (138) 

with parameters p and q. The asymptotic behavior H134|l then immediately fixes 

g = 0,|j = -l,anda = 2. (139) 
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Thus, 

The eigenvalue distribution of B is therefore 



F{w) = l-\l- . (140) 

w 



p{s) = -lin F{s-ie) = -\j- — - (141) 
vr TT V s 

for < s < 2, and zero elsewhere. As a simple check, note that 

2 

j p{s)ds = 1, 



as guaranteed by the unit numerator in 1)1341) . 

Thus, as mentioned in the previous appendix, ojf, the eigenvalues of B^B, are confined 
in a finite segment < s < 2. In the limit m ^ oo, they form a continuous condensate in 
this segment, with non uniform distribution ()141)) . 

In an obvious manner, we can calculate S^:, the extremal value of S in ()111() . by replacing 
the discrete sums over the Si by continuous integrals with weights p{s) given by (|14H) . We 
do not calculate explicitly, but merely mention the obvious result that it is a number of 
0{m?'). Similarly, from H115|) and H141|) we obtain 

2 2 



h{y)= J p{s)log^l + y^j ds = -J ^—^log^l + y^j ds. (142) 


Since the continuum approximation for p{s) introduces an error of the order 1/m, an error 
of similar order is introduced in I^. It is easier to evaluate and then integrate back, 

to obtain h{y). We find from (fTl^ 



^?^lM = _^(_,) = _l + ^±^ = _l, (143) 

It is clear from the last equality in (|143() that 

^ > (144) 
dy 

for y > 0. Integrating (|143() . and using (|113() . /*(0) = 0, to determine the integration 
constant, we finally obtain 



I* (y) = -y + V + 2y + log (^y + 1 + V + 2y j • (145) 

From H145|) we obtain the limiting behaviors 

/,(y) = 2v/2^-y + 0(y3/2), 0<y«l, (146) 

and 

I,{y) =log(^) +0 (-) , y»l. (147) 



Due to (|143|) . I*{y) increases monotonically from /^.(O) = to its asymptotic form H147|) . 
Note that for y = t/m (as required in (jllTj) '). the second term in ()146|) is 0{\/m) and 
therefore it is beyond the accuracy of the approximation of this section. 
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